Sequence-Learning Algorithm Based on Backward Chaining

Authors

  • Sanjay S. Joshi
  • Benoit Guilhabert
Abstract

This article considers the problem of learning the correct temporal sequence of discrete behaviors from a finite behavior set that will lead to completion of a complex task, using only stochastic reinforcement from the environment. A trial-and-error learning algorithm is proposed that is inspired by backward chaining from the animal training discipline. The procedure is analytically formulated using a serial composition of finite action-set learning automata with delay. Simulation of the proposed algorithm shows that the algorithm does indeed lead to sequence learning. The effect of parametric variation in the magnitude and quality of reinforcement is investigated in both theory and simulation. It is shown that a fundamental tradeoff exists between quality and speed of learning. It is also shown that the algorithm has the ability to learn desirable action sequences among several feasible action sequences through the use of relative rewards, which may be interpreted using the Bellman principle of optimality.
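The backward-chaining idea in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' formulation: it uses one finite action-set automaton per step, trains the last step first, and applies a linear reward-inaction update, which is a standard learning-automaton scheme chosen here for illustration. The names `lri_update`, `sample`, and `learn_sequence`, and the parameters `rate`, `p_reward`, and `trials_per_stage`, are all hypothetical, and the delay modelled in the article is not reproduced.

```python
import random


def lri_update(probs, chosen, rate):
    """Linear reward-inaction step, applied only after a rewarded trial.

    The chosen action gains probability; all others shrink proportionally,
    so the vector still sums to 1.
    """
    return [p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
            for i, p in enumerate(probs)]


def sample(probs, rng):
    """Draw one action index according to the automaton's probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def learn_sequence(target, n_actions, rate=0.05, p_reward=0.9,
                   trials_per_stage=2000, seed=0):
    """Learn `target` (a list of action indices) by backward chaining.

    Assumption: the environment rewards a trial with probability `p_reward`
    when the executed tail of the chain matches the target, and never
    rewards an incorrect tail.
    """
    rng = random.Random(seed)
    n = len(target)
    # One automaton per step, each starting from a uniform distribution.
    automata = [[1.0 / n_actions] * n_actions for _ in range(n)]

    # Train the final step first, then prepend earlier steps one at a time.
    for stage in range(n - 1, -1, -1):
        for _ in range(trials_per_stage):
            # Execute the chain from the current stage through to the end.
            actions = [sample(automata[k], rng) for k in range(stage, n)]
            completed = actions == list(target[stage:])
            # Stochastic reinforcement arrives only at task completion.
            if completed and rng.random() < p_reward:
                # Only the automaton being trained at this stage is updated.
                automata[stage] = lri_update(automata[stage], actions[0], rate)

    # Report the most probable action of each automaton.
    return [max(range(n_actions), key=probs.__getitem__) for probs in automata]


if __name__ == "__main__":
    print(learn_sequence([2, 0, 3, 1], n_actions=4))
```

In this sketch, lowering `p_reward` or raising `rate` is one way to experiment with the speed-versus-quality tradeoff that the abstract mentions; the specific values above are arbitrary.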

Similar resources

Formalizing the PRODIGY Planning Algorithm

The PRODIGY project is primarily concerned with the integration of planning and learning. Members of the PRODIGY research group have developed many learning algorithms for improving planning efficiency and plan quality, and for automatically acquiring knowledge about the properties of planning domains. The details of the PRODIGY planning algorithm, however, have not been described in the litera...

Sequence Learning by Backward Chaining in Synthetic Characters

We present a learning mechanism that allows an autonomous synthetic character to learn sequences of actions from natural interaction with a human trainer. The synthetic character learns from only a handful of training examples, in a realtime and complex environment. Building on an existing framework for training a virtual dog to perform single actions on command and explore its action and state...

Type-2 Fuzzy Hybrid Expert System For Diagnosis Of Degenerative Disc Diseases

One-third of the people with an age over twenty have some signs of degenerated discs. However, in most of the patients the mere presence of degenerative discs is not a problem leading to pain, neurological compression, or other symptoms. This paper presents an interval type-2 fuzzy hybrid rule-based system to diagnose the abnormal degenerated discs where pain variables are represented by interv...

Finding Optimal Derivation Strategies in Redundant Knowledge Bases

A backward chaining process uses a collection of rules to reduce a given goal to a sequence of database retrievals. A "derivation strategy" is an ordering on these steps, specifying when to use each rule and when to perform each retrieval. Given the costs of reductions and retrievals, and the a priori likelihood that each particular retrieval will succeed, one can compute the expected cost of a...

Development of a Learning System for Proving the Congruence of Two Triangles by Supporting ‘Backward Chaining’

In this study, we have designed and implemented a system that supports learners to use backward chaining to solve proof problems about the congruence of two triangles. In addition, we have carried out an evaluation experiment in a public junior high school to measure the learning effect of the system. As a result of hypothesis tests, for students with good pretest scores, the effect of the syst...

Journal title:
  • Adaptive Behaviour

Volume 14

Publication date: 2006